A Bootstrap Evaluation of the Variation of Rainfall Sample Histograms Estimated from Finite Historical Records

Authors

  • R. G. Addie
  • M. Roberts
Abstract

This paper investigates the error, caused by the limited length of the historical record, in estimating the conditional distribution of rainfall given a particular pre-existing or coexisting Southern Oscillation condition. Some statistical procedures for reducing this error are proposed and investigated.

Introduction

Rainfall in Australia appears to be influenced by the El Niño/Southern Oscillation (ENSO) phenomenon. Exploiting this relationship for long-term forecasting is not entirely straightforward, however, because any estimate has to be based on a finite historical record. In this paper we propose a technique which helps to improve the quality of estimates of the influence of the Southern Oscillation Index (SOI) on rainfall by aggregating results from nearby towns to reduce noise in the estimate.

Any method which regards the historical record as a true indication of the long-term state of affairs ignores the important problem of sampling error. We investigate a bootstrap procedure for estimating sampling error. This procedure amounts to repeatedly taking a random sample of years from the historical record and re-estimating the influence of the SOI on rainfall for each sample. The variation between these estimates can then be regarded as a first-cut estimate of the sampling error introduced by the historical sampling process.

Techniques

The El Niño/Southern Oscillation phenomenon can be measured by means of the Southern Oscillation Index (SOI), which is obtained by subtracting the anomaly of the air pressure at Darwin from the anomaly of the air pressure at Tahiti, dividing this figure by its standard deviation (as estimated over a certain historical period) and multiplying by 10. Although this is a rather primitive way to summarise what must be a highly complex phenomenon, it is widely accepted as an effective indicator of the ENSO phenomenon.
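The SOI calculation described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `soi_series` is hypothetical, the pressure anomalies are made-up numbers, and the standard deviation is estimated from the supplied series itself, whereas the text only says it is estimated "over a certain historical period".

```python
import statistics

def soi_series(tahiti_anomalies, darwin_anomalies):
    """SOI for each period: 10 * (Tahiti anomaly - Darwin anomaly) / SD.

    Assumption: the standard deviation is estimated from the supplied
    series; a real calculation would use a fixed historical base period.
    """
    differences = [t - d for t, d in zip(tahiti_anomalies, darwin_anomalies)]
    sd = statistics.stdev(differences)
    return [10.0 * diff / sd for diff in differences]

# Made-up pressure anomalies, for illustration only (not observations).
tahiti = [1.0, -1.0, 2.0, -2.0]
darwin = [-1.0, 1.0, -2.0, 2.0]
soi = soi_series(tahiti, darwin)
```

By construction the resulting series has a standard deviation of 10 over the period used to estimate it, which is why SOI values are usually quoted on a scale of roughly -30 to +30.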
SOI Phases

As a means of identifying a predictive relationship between the SOI and rainfall, Stone (1992) suggested that, instead of the SOI record itself, an SOI-related phase might be more useful, one which captures both the current state of the SOI and whether it is moving up or down. Five phases were defined: Phase 1, very low SOI with little change; Phase 2, very high SOI with little change; Phase 3, rapidly falling SOI; Phase 4, rapidly rising SOI; and Phase 5, normal SOI with little change.

Typically, the phase is used to estimate the influence of the SOI on rainfall by sifting the recorded rainfall at a location in a particular month according to the corresponding SOI phase (the correspondence may include a time lag between the SOI phase observation and the rainfall observation). In predicting rainfall for this location and month, the rainfall distribution associated with the current phase of the SOI is used to give an indication of the likelihood of rain. This prediction is simply an observation of past frequency under specified conditions and relies on a relatively small sample size.

Two basic assumptions are made in this method. One is that there is no slow overall change in the weather over time. The other is that the sample used for the prediction is truly representative of the population of rainfall through all time. In this paper we investigate the second of these assumptions.

Variations due to Sampling

The data that we have to work with are derived from approximately 100 years of history. Thus for each month, and each rainfall observation site, there are up to about 100 sample points. Dividing these into the five SOI phases means that we are working with distributions of approximately 20 sample points each. The frequency of occurrence of each phase changes over the months. In June, Phase 1 occurs only ten times in the entire recorded history of the SOI.
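The sifting step can be pictured as a simple grouping of rainfall totals by phase. The sketch below is an assumption-laden illustration, not the authors' code: `sift_by_phase` is a hypothetical helper, the year labels are arbitrary indices, and the phase and rainfall values are made up.

```python
from collections import defaultdict

def sift_by_phase(phase_by_year, rain_by_year):
    """Group rainfall totals by the SOI phase recorded for the same year.

    The lag is handled by the caller: pass, for example, June phases and
    September rainfall, both keyed by year.
    """
    groups = defaultdict(list)
    for year, phase in phase_by_year.items():
        if year in rain_by_year:  # skip years with no rainfall record
            groups[phase].append(rain_by_year[year])
    return {phase: sorted(values) for phase, values in groups.items()}

# Made-up values for illustration only, not historical data.
june_phase = {1: 1, 2: 5, 3: 1, 4: 3, 5: 5}
september_rain_mm = {1: 12.0, 2: 40.0, 3: 7.0, 4: 25.0, 5: 33.0}

conditional_samples = sift_by_phase(june_phase, september_rain_mm)
```

Each entry of `conditional_samples` is the small conditional sample on which a forecast for that phase would rest, which makes the sample-size problem easy to see.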
In using the June SOI conditions to predict September rainfall at any location, only ten sample points can therefore be used for Phase 1. The recorded history of September rainfall for Toowoomba when the SOI for June was in Phase 1 is shown in Table 1.

Year            1888  1896  1905  1911  1912  1940  1946  1972  1977  1987
Rainfall (mm)     41    14    16    23    28    24   120     8    11     3

Table 1. Recorded Rainfall for Toowoomba in Phase 1

There are ten sample points (rain data) and we associate each one with a percentile. We could say, for example, that there is a 10% chance of getting less than 3 mm total rainfall for September in Toowoomba when the June phase is 1. The value 3 mm is allocated this percentile because it is the smallest in our sample. If much longer records were available, this could turn out to be 15%, 8% or even 30%. Rainfall could be below 3 mm thirty percent of the time when the June SOI phase is 1, and we just happened to strike some of the wetter years for these conditions!

To investigate the variance of the estimate of a histogram due to sampling, in a sample of size twenty for example, we could repeatedly choose twenty percentile values at random, sort them, and then associate these percentiles with our rainfall sample points. Each such assignment represents another possible true distribution for our population. The result of this process, starting from a straight-line histogram (which would be appropriate if we measured rainfall in percentiles, for example), is illustrated in Figure 1.

Figure 1. One hundred possible interpretations of a sample histogram of rainfall of length 20.

The Kolmogorov-Smirnov goodness-of-fit test gives specific bounds within which the true distribution lies with a given level of confidence. For a sample of size 20, the distribution function of the population from which the sample was taken is 90% certain to lie within a specific distance of the sample distribution.
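The percentile assignment and the random reinterpretation described above can be sketched as follows, using the ten Table 1 values. The function names are hypothetical. For the confidence band, the sketch substitutes the Dvoretzky-Kiefer-Wolfowitz inequality, P(sup|F_n - F| > e) <= 2 exp(-2 n e^2), as a conservative stand-in for the exact Kolmogorov-Smirnov critical value, which would normally be read from tables.

```python
import math
import random

# September rainfall (mm) at Toowoomba for June Phase 1 years, from Table 1.
table1_rain_mm = [41, 14, 16, 23, 28, 24, 120, 8, 11, 3]

def sample_percentiles(values):
    """Pair the i-th smallest of n values with the percentile 100 * i / n."""
    n = len(values)
    return [(100.0 * (i + 1) / n, v) for i, v in enumerate(sorted(values))]

def random_reinterpretation(values, rng):
    """Attach freshly drawn, sorted uniform percentiles to the sorted values."""
    percentiles = sorted(rng.uniform(0.0, 100.0) for _ in values)
    return list(zip(percentiles, sorted(values)))

def dkw_half_width(n, confidence=0.90):
    """Half-width of a DKW band (as a fraction, not a percentage)."""
    alpha = 1.0 - confidence
    return math.sqrt(math.log(2.0 / alpha) / (2.0 * n))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
curves = [random_reinterpretation(table1_rain_mm, rng) for _ in range(100)]

print(sample_percentiles(table1_rain_mm)[0])  # (10.0, 3): smallest value gets 10%
print(dkw_half_width(20))                     # band half-width for a sample of 20
```

For a sample of 20 the band comes out at roughly 27 percentage points either side of the sample distribution, which gives a sense of how loosely twenty points pin down the underlying distribution.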
These Kolmogorov-Smirnov bounds are also indicated in Figure 1.

Aggregate Estimation of the Effect of the SOI on Rainfall

We could perhaps increase the statistical significance of our results by developing a method which simultaneously estimates the effect of the SOI for all weather stations. The problem we must overcome in order to do this is that the weather stations are all quite different: rainfall distributions vary considerably from place to place, primarily because of geographical factors. One way to overcome this problem is to measure rainfall at each location in a manner which is "scale independent", that is to say, in units which are in some sense "universal". One such set of units is percentiles, i.e. the proportion of rainfall measurements at the specific location in question which are less than the measurement under consideration. If the effect of the SOI is broadly similar for all weather stations in the same region, then we should see a consistent influence when we measure it in these units.

In order to check this hypothesis, the following experiment was carried out: all the rainfall data for a region were "recalibrated", that is, measured in local percentiles rather than in mm. These measurements were then aggregated over the region by simple averaging. This produces a single rainfall record for a mythical place: Normville. The rainfall record for Normville has less variance than that of the individual towns because it was produced by aggregation. We hope that this lower variance will produce better estimates of the relationship between the SOI phase at a certain time and the rainfall at an individual town at a specific time of the year. Plots of the rainfall measured in percentiles at Normville under certain conditions are shown in Figure 2. Unfortunately, Normville's rain is measured not in mm but in percentiles, so we know only whether the rain is more or less than usual, and how unusual, but not how much actually fell.
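The recalibration and averaging that produce the Normville record can be sketched as below. This is a minimal illustration under assumptions: the station names and rainfall totals are made up, every station is assumed to cover the same years in the same order, and the percentile is taken literally as the share of that station's measurements that are strictly smaller.

```python
def percentile_ranks(values):
    """Percent of a station's measurements that are strictly less than each one."""
    n = len(values)
    return [100.0 * sum(other < v for other in values) / n for v in values]

def normville(rain_by_station):
    """Average the local percentiles across stations, year by year.

    Assumption: every station's list covers the same years in the same order.
    """
    recalibrated = [percentile_ranks(series) for series in rain_by_station.values()]
    return [sum(column) / len(column) for column in zip(*recalibrated)]

# Made-up rainfall totals (mm) for three years at two hypothetical stations.
toy_rain = {
    "station_a": [10.0, 30.0, 20.0],
    "station_b": [300.0, 500.0, 100.0],
}
normville_record = normville(toy_rain)
```

Although the two toy stations differ by an order of magnitude in mm, they contribute on the same 0 to 100 scale once recalibrated, which is the point of measuring in local percentiles.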
In order to convert from percentiles back to mm, we must choose a town and use its sample histogram of rain, inverted, as a mapping from percentiles to rainfall. This will then provide estimates of the rainfall of this town under the given conditions.

[Figure 2 plot: Consolidated Distribution of Rainfall (in percentiles), Lag 1, Phase 1; curves labelled Ident, Jun, Jul, Aug, Sep, Oct, Nov; both axes run from 0 to 100]

Figure 2. Estimates of SOI Influenced Rainfall Distribution

Estimation of the Error

The end result of the approach outlined above is a curve which looks very much like a sample distribution. The horizontal axis happens to be measured in "percentiles" instead of rainfall values, but in other respects this curve looks just like a sample distribution of rainfall for an individual town. We could perhaps describe our data analysis procedure by saying that we have summarised a large collection of rainfall data by producing figures for a "typical place", Normville say, and that subsequent stages of analysis can be regarded as studies of Normville.

If we interpret the curve in this way, the error in estimation of the distribution of rainfall at Normville might perhaps be no less than for any other town. For example, one way to estimate the estimation error of a sample distribution is to resample the sample, estimate the distribution from each resample, and then compare these "bootstrapped" estimates to get an idea of how inaccurate the original estimate is likely to be. Another way was described earlier in connection with the Kolmogorov-Smirnov test. For the original towns, these two procedures should be consistent and equally appropriate. However, applied to Normville, such a procedure is somewhat pessimistic as a method for estimating the estimation error of our aggregate estimate of the rainfall distribution, because the estimate of the "rainfall" at Normville is itself obtained as an average. Here is an alternative bootstrap procedure which takes this fact into account.
Let us resample the sample by taking a random sample of years (including duplicates). We then treat this resampled sample as if it were our entire data set and re-apply the procedure described above for estimating the rainfall distribution at Normville. We repeat this procedure many times (e.g. 100 times), using a different random sample each time. The collection of all the estimated distributions for the rainfall at Normville is then an indication of the estimation error for the original estimate of the distribution.

This more careful approach to estimating the error in estimation of the relationship between SOI and rainfall has been carried out and the results are shown in Figure 3.

[Figure 3 plot: Estimates of Normville Rainfall Distribution from resampled history, Lag 1, Phase 1; curves labelled Identity, Samp1 to Samp5; both axes run from 0 to 100]
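The year-resampling bootstrap can be sketched as follows. This is a hedged illustration, not the authors' implementation: all function names are hypothetical, the rainfall records are synthetic, the phase years are chosen arbitrarily, and the sketch assumes that local percentiles are recomputed within each resample, since the resample is treated as the entire data set.

```python
import random

def percentile_ranks(values):
    """Percent of the values that are strictly less than each entry."""
    n = len(values)
    return [100.0 * sum(other < v for other in values) / n for v in values]

def normville_distribution(rain_by_station, years, phase_years):
    """Sorted regional-average percentiles for the years in the given phase.

    `years` may contain duplicates, as it does in a bootstrap resample.
    """
    per_station = [percentile_ranks([record[y] for y in years])
                   for record in rain_by_station.values()]
    averaged = [sum(column) / len(column) for column in zip(*per_station)]
    return sorted(p for y, p in zip(years, averaged) if y in phase_years)

def bootstrap_normville(rain_by_station, years, phase_years,
                        n_resamples=100, seed=0):
    """Re-estimate the Normville distribution on resampled sets of years."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_resamples):
        resample = rng.choices(years, k=len(years))  # sampling with replacement
        estimates.append(
            normville_distribution(rain_by_station, resample, phase_years))
    return estimates

# Synthetic records for illustration only: 40 hypothetical years, 3 stations.
data_rng = random.Random(1)
years = list(range(40))
rain_by_station = {
    name: {y: data_rng.gammavariate(2.0, 20.0) for y in years}
    for name in ("station_a", "station_b", "station_c")
}
phase_years = set(years[::5])  # pretend every fifth year fell in the phase

estimates = bootstrap_normville(rain_by_station, years, phase_years)
```

The spread among the curves in `estimates` plays the role of the resampled curves in Figure 3. Because whole years are resampled, the averaging across stations is repeated inside every replicate, so the variance reduction from aggregation is reflected in the error estimate.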

Publication date: 1996